The present study was designed to look at the effects of adding
quantitative  and  qualitative  data  to  a  relevant  clinical  judgment 
task.   In  essence,  it  compared  judges  with  varying  degrees  of  clini- 
cal experience  to  actuarial  prediction  methods.   The  study  also 
attempted  to  train  judges  to  use  actuarial  information  to  improve 
their  prediction  accuracy. 

Twelve  judges  representing  three  levels  of  clinical  experience 
made  post-dictive  judgments  on  the  length  of  stay  in  psychotherapy 
(short  or  long)  from  a  sample  of  MMPI  profiles  of  clients  seen  in  a 
university  mental  health  service.   Judgments  were  made  under  four 
conditions  in  which  qualitative  and  quantitative  information  was 
added incrementally at each level. The three levels of judges'
experience were professional clinical psychologists, "sophisticated"
third year clinical psychology graduate students trained in
statistical decision theory, and "unsophisticated" third year clinical
psychology graduate students without any training in statistical
decision theory.

Accuracy  increased  over  levels  of  information  but  there  were  no 
differences  in  accuracy  for  the  three  levels  of  experience.   A  sig- 
nificant group  by  information  level  interaction  demonstrated  some 
group  effects  due  to  a  lower  proportion  of  correct  judgments  for  the 
less  experienced  judges  under  conditions  involving  the  least  amount 
of  information. 

Judges  became  more  confident  in  their  judgments  as  they  received 
more  information.   Appropriateness,  defined  as  accuracy  weighted  by 
confidence  and  measured  by  correlation  coefficients  between  accuracy 
and  confidence,  increased  substantially  as  increments  of  information 
were  added.   The  group  trained  in  statistical  decision  theory  tended 
to  make  the  most  appropriate  judgments  and  the  least  experienced 
group  of  graduate  students  tended  to  make  the  least  appropriate  judg- 
ments. 

The  present  study  showed  that  clinicians  can  use  quantitative 
data  to  improve  their  own  judgmental  ability  and  to  predict  more 
accurately  than  actuarial  data  alone.   Also,  since  those  judges  with 
the  most  experience  in  using  actuarial  tasks  tend  to  be  the  most 
appropriate  in  their  judgments,  this  implies  that  clinicians  can  also 
be  trained  to  be  more  appropriate  and  to  know  when  their  judgments 
are  more  likely  to  be  accurate. 


INTRODUCTION 

The  objectives  of  the  present  study  were  two-fold.   The  primary 
objective  was  to  examine  the  effects  of  test  and  non-test  (statistical) 
information on the judgmental process. The secondary objective, and
the focus for implementing the primary objective, was to study the
psychological attributes of individuals who stay only a short time in
therapy versus those who remain a long time. That is, the objective
of studying the judgmental process was couched in a real and relevant
situation, length of stay in psychotherapy, which is a meaningful and
pressing problem for psychologists today. However, this secondary
objective was minor in relation to the major issue of examining the
clinical judgment process.

Clinical Versus Actuarial Prediction

Ever since Paul Meehl's book, Clinical Versus Statistical
Prediction, clarified the issue of clinicians' predictions versus
actuarial predictions, there have been numerous studies comparing
these two prediction methods. As Meehl (1954) points out, however,
the two methods need not be mutually exclusive since the clinician can
incorporate actuarial methods and data into his prediction process.
Many studies have focused not only on comparing clinicians to
statistical formulae but also on improving the clinician's ability to
predict by giving him useful statistical information and training him
to use this information.

In general, the studies which compared clinicians to actuarial
methods found that the actuarial methods were either superior to
clinicians or equal in efficiency to clinicians (Meehl, 1965). With the
exception of one study, the clinician has shown no superiority to
purely quantitative actuarial prediction. The one study which did
find clinicians superior (Lindzey, 1965) used one to two clinicians
and its application is somewhat questionable. One reason the
clinician has not been superior to actuarial methods is that he has seldom
been given the opportunity to incorporate the actuarial information in
formulating his final decision. He has been at a disadvantage so that
the demonstrated superiority of the actuarial method may be due to the
experimental design rather than to an actual superiority of statistical
techniques. Also, the information available to the clinician has often
been based on non-quantitative data such as interview material, case
history data, and projective tests.

Holtzman (1960) separates the clinician's diagnostic task into
three phases: (1) collection of information; (2) preparation and
translation of this information for analysis; (3) interpretation of this
information. As he points out, actuarial methods, and specifically
the computer, are superior to the clinician in processing information
once the primary coding has been done. The clinician is still
superior at collecting information and at interpreting it because at
present the computer lacks the appropriate rules and parameters for
interpretation. Thus, studies which emphasize aspects of prediction
suitable for actuarial methods do not use the clinician's talents to
best advantage. It is when skilled clinicians use familiar methods
to predict a criterion they know something about that they have the
most success (Holt, 1958). This includes their having a rich body of
data and systematic actuarial procedures at their disposal in addition
to their own experience, intuition, and knowledge.

Recent studies suggest that as the amount of clinical experience
increases, prediction accuracy decreases (Goldberg, 1959; Oskamp,
1962; Shagoury, 1969; Shagoury & Satz, 1969). These studies compared
trained clinicians with a professional degree to clinical psychology
graduate students and even to non-professional groups, such as
secretaries, and have found that the trained clinicians were not superior
to the other groups. An explanation of this finding is that the more
experienced clinician has developed a particular way of looking at
data which interferes with his making unbiased, objective decisions.

Another aspect of research in the area of clinical versus
statistical predictions is the confidence clinicians place in their
judgments and the appropriateness of their predictions. Appropriateness
is a measure of confidence weighted by accuracy which was developed
by Adams (1957). Confidence in judgments also differs between groups
of graduate students and trained clinicians, with the trained
psychologists being less confident in their judgments (Goldberg, 1959; Oskamp,
1962). When the measurement of appropriateness of the judgment is
introduced, however, the trained clinicians are more appropriate in
their confidence levels than are either graduate students or
non-professionals (Oskamp, 1962; Shagoury, 1969). That is, clinicians
are more confident of their correct decisions and less confident of
their incorrect decisions. The amount of information available to
the judge does not correlate with his predictive accuracy but
increased amounts of information substantially increase confidence
levels (Goldberg, 1968).
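
One way to make the notion of appropriateness concrete is to correlate, for a single judge, whether each judgment was correct with the confidence attached to it. The sketch below is only an illustration under stated assumptions: it uses a plain Pearson correlation, which may differ from the exact index of Adams (1957), and the accuracy and confidence values are invented for the example, not data from any study cited here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical judge: 1 = correct judgment, 0 = incorrect judgment,
# with a confidence rating (1 = low, 5 = high) for each judgment.
correct = [1, 1, 0, 1, 0, 0, 1, 1]
confidence = [5, 4, 2, 5, 1, 2, 4, 3]

# A positive value means the judge tends to be more confident when right.
appropriateness = pearson_r(correct, confidence)
print(round(appropriateness, 2))
```

A judge whose confidence is unrelated to correctness would score near zero on this index, regardless of how accurate or how confident he is overall.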

Goldberg (1968) also discusses the nature of the judgmental
process. He questions whether judges use simple decision-making
models such as linear models, or complex processes such as configural
models. In an analysis of clinicians' judgments he found that a
linear model usually reproduced 90 to 100 per cent of the reliable
judgmental variance on most decision-making tasks even though the
clinicians generally felt that they used more complex, configural
models.

Using  Statistical  Information  to  Increase  Prediction  Accuracy 

Training in the use of statistical information has been shown
to improve judgmental accuracy. In a study by Oskamp (1962),
clinicians were able to improve their ability to distinguish psychiatric
and medical patients on the basis of their Minnesota Multiphasic
Personality Inventory (MMPI) profiles when they were provided with
actuarial rules. A statistical formula predicted with 75 per cent
accuracy and the clinicians, after training, were able to reach this 75
per cent accuracy level.

Goldberg (1968) trained judges by giving them a formula and
optimum cutting score for distinguishing neurotic from psychotic MMPI
profiles. The judges were told that the statistical information
predicted with 70 per cent accuracy and they were encouraged to use
this information along with any other information they thought would
improve their prediction accuracy. Goldberg found that after eight
weeks of "value training," the judges, on the average, increased their
accuracy from between 52 and 65 per cent to approximately 70
per cent. This was the only type of training that substantially
improved accuracy. Thus, feedback is necessary if the clinician is to
learn how to improve his decision-making techniques.

Another useful type of statistical information is the incidence,
or base rate, of a given trait in the population available to the
clinician. Goldberg (1959), for example, had judges distinguish
brain-damaged patients from functional patients on the basis of
Bender-Gestalt protocols. The protocols were randomized into different
groups in which the incidence of brain-damage varied from high (p = .8)
to low (p = .2). Goldberg found no difference in judgmental accuracy
between these groups. Unfortunately, the base rate information was
not provided to the judges.

The importance of base rates for evaluating predictive tests was
discussed by Meehl and Rosen (1955). They cite as an example an Army
adjustment test for predicting which inductees would adjust to the
service.   The  test  predicted  inductee  adjustment  with  an  accuracy  of 
79.7  per  cent.   However,  the  overall  percentage  of  inductees  who 
adjusted  was  95  per  cent;  thus,  utilization  of  the  base  rates  alone 
(i.e.,  predicting  adjustment  in  all  cases)  would  result  in  a  hit  rate 
of  95  per  cent. 
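
The comparison in the Army example can be written out as a short calculation. This is a minimal sketch: the two percentages are those quoted above from Meehl and Rosen (1955), while the function name is introduced here purely for illustration.

```python
def base_rate_hit_rate(base_rate):
    """Hit rate obtained by always predicting the more frequent outcome."""
    return max(base_rate, 1.0 - base_rate)


test_hit_rate = 0.797       # accuracy of the adjustment test, as quoted above
adjusted_base_rate = 0.95   # proportion of inductees who adjusted

baseline = base_rate_hit_rate(adjusted_base_rate)

# The test is only worth using if it beats the base-rate prediction.
test_beats_base_rate = test_hit_rate > baseline
print(baseline, test_beats_base_rate)
```

The same function shows why the comparison matters most for lopsided base rates: with a base rate of .50 the baseline is only .50, so even a modest test can improve on it.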

Another  application  of  base  rates  is  through  Bayesian  statisti- 
cal theory  which  combines  the  base  rates  with  the  valid  and  false 
positive  rates  of  a  particular  test  to  give  a  conditional  probability 
for  the  likelihood  of  being  correct  or  incorrect  given  a  certain  test 


sign  in  a  given  base  rate  population. 

Shagoury (1969) and Shagoury and Satz (1969) demonstrated that
clinicians can substantially improve their predictive accuracy when
provided with information on base rates and conditional probabilities.
These studies showed that increments in statistical information, added
to test data, significantly increased the accuracy of judges in a
real-life clinical decision task of predicting brain-damaged and
functional patients on the basis of a block rotation task (Satz, 1966).
Their judges' accuracy approximated that obtained by a discriminant
function predictor score (Z). Composite Z scores were de-emphasized
by the judges in favor of using the additional information such as
the base rates, differential error risks, and conditional
probabilities. However, in groups with a high incidence of brain-damaged
individuals (base rate = .8) the judges' overall accuracy decreased,
perhaps due to a reluctance to diagnose pathology.

Meehl and Rosen (1955) point out that test development should be
concentrated on populations with base rates near .50 rather than on
populations with base rates approaching .00 or 1.00 since the use of
a test in the latter cases will lower the hit rate of using the base
rates alone.

A cutting score, or composite Z score, derived from discriminant
function analysis can be manipulated for various purposes in
prediction. It can be used to maximize the number of correct predictions
for all cases or to maximize only the correct predictions for positives.

An excellent application of this technique of discriminant
function analysis to decision theory in a clinical setting was demonstrated
by Satz (1966). Discriminant function analysis is a statistical
technique devised to maximally differentiate discrete criterion groups
when multiple measurements are involved. This is essentially a
multiple regression technique except for a discontinuous distribution on
the criterion variable. The following linear equation expresses this
function:

$$Z = \lambda_1 X_1 + \lambda_2 X_2 + \cdots + \lambda_p X_p$$

where Z is the composite predictor score based on the individual scores
on each of the variables ($X_1, X_2, \ldots, X_p$) and the respective weights,
or lambdas, assigned to each of the variable scores
($\lambda_1, \lambda_2, \ldots, \lambda_p$).
If there are two criterion groups involving multiple measures, the
discriminant function determines optimal weights (lambdas) for these
variables which will maximize the difference between the composite Z
scores of the two criterion groups.
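
The composite score is simply a weighted sum, and a cutting score turns it into a two-way prediction. The following sketch illustrates that arithmetic only; the weights, scale scores, and cutting score are hypothetical values chosen for the example and are not the lambdas estimated in any study discussed here.

```python
def composite_score(weights, scores):
    """Return Z = sum of lambda_i * X_i over all predictor variables."""
    if len(weights) != len(scores):
        raise ValueError("weights and scores must have the same length")
    return sum(w * x for w, x in zip(weights, scores))


def classify(z, cutting_score):
    """Label a composite score as a positive or negative test sign."""
    return "positive" if z >= cutting_score else "negative"


# Hypothetical lambdas and scale scores for a three-variable example.
lambdas = [0.5, 0.25, 0.25]
profile = [60, 70, 50]
hypothetical_cutoff = 58.0

z = composite_score(lambdas, profile)   # 30 + 17.5 + 12.5
print(z, classify(z, hypothetical_cutoff))
```

Moving the cutting score up or down trades one kind of error for the other, which is the sense in which the score can be manipulated for different purposes.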

Length of Stay in Psychotherapy as a Criterion Variable

Why is length of stay in psychotherapy a meaningful problem for
study? First, there is the great demand for psychological services
with a present-day manpower shortage of trained clinicians. Most
clinics that see individuals with psychological problems are
understaffed, have patient waiting-lists, or both. There are also
differential risks involved in selecting who will be seen in therapy. It
is far more serious to miss those who are severely disturbed and need
long-term psychotherapy because of the threat these individuals may
pose to themselves or to society, than it is to wrongly classify
persons who need only a few sessions and are experiencing minor
difficulties in their lives. The first type of error, that of
predicting a short stay in therapy based on a negative test score when
in fact the person stays a long time, is a false negative error. A
false positive error results from the prediction of a long stay in
therapy based on a positive test score when the individual actually
stays only a few sessions.

Meehl and Rosen (1955) point out that often in a clinical
setting external restraints are imposed, perhaps due to a shortage of
staff time, patient waiting-lists, or administrative policy. If this
is the case, decisions cannot always be made in accordance with known
base rates. They give the following example to illustrate the use of
an externally imposed selection ratio. If 80 per cent of the patients
referred to a mental health clinic are recoverable with intensive
psychotherapy, then everyone should be treated rather than relying on
a test which predicts only 75 per cent of those who will have a
favorable therapy outcome. However, if staff time is limited and only half
of the referrals can be treated, following the base rates is
meaningless because this would lead to a decision that would be impossible
to implement. In this case, where a selection ratio of .5 is
externally imposed, the use of the test becomes worthwhile. Given the
figures in Table 1 (Meehl & Rosen, 1955), those 50 cases out of the
100 referrals to be treated are selected from those individuals the
test predicts will be "good" therapy risks. If this is done there is
a 92.3 per cent hit rate among those selected for therapy (60/65).
Stated another way, the test will be correct in 46 out of the 50
cases which will be successes (half of the 80 good therapeutic
outcome group).


                         Table 1
        Actual and Test-Predicted Therapeutic Outcome

                        Therapeutic Outcome
Test Prediction      Good      Poor      Total

Good                  60         5         65
Poor                  20        15         35

Total                 80        20        100
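
The figures in the selection-ratio example can be checked with a few lines of arithmetic. This is a sketch under one assumption: the two "Poor outcome" cells (5 and 15) are derived here from the row and column totals of Table 1 rather than read directly from it. The variable names are introduced for illustration only.

```python
# Cell counts keyed by (test prediction, actual outcome).
# The (good, poor) and (poor, poor) cells are derived from the totals:
# 65 - 60 = 5 and 35 - 20 = 15.
table = {
    ("good", "good"): 60,
    ("good", "poor"): 5,
    ("poor", "good"): 20,
    ("poor", "poor"): 15,
}

total = sum(table.values())
predicted_good = table[("good", "good")] + table[("good", "poor")]

# Base rate of a good outcome if every referral were treated.
treat_all_rate = (table[("good", "good")] + table[("poor", "good")]) / total

# Overall accuracy of the test across all referrals.
test_accuracy = (table[("good", "good")] + table[("poor", "poor")]) / total

# Hit rate among cases the test calls good risks, and the expected
# number of successes when 50 cases are drawn from that group.
hit_rate_selected = table[("good", "good")] / predicted_good
expected_successes = 50 * hit_rate_selected

print(treat_all_rate, test_accuracy,
      round(hit_rate_selected, 3), round(expected_successes, 1))
```

Selecting 50 referrals without the test would be expected to yield about 40 successes at the 80 per cent base rate, so under the imposed selection ratio the test adds roughly six successful cases.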

A second reason for selecting length of stay in psychotherapy as
the focus for a clinical judgment study is that the problem can be
subjected to multivariate and statistical decision theory analysis in
order to increase the predictive relationship between signs and
criteria. This possibility thus increases its application and potential
usefulness to clinical judges.

One study in this area found that there are differences in
behavior in psychotherapy between individuals which are predictable
from an MMPI profile (Mello & Guthrie, 1958). Mello and Guthrie
studied 219 individuals seen at a college psychological clinic. They
used only those profiles with at least one T score greater than 70.
They found that length of stay in therapy was related to high scores
on various scales of the MMPI. Of those students with high scores
on Scale 2(D), kS per cent remained only one to three sessions.
Persons high on Scale 3(Hy) tended to stay in therapy longer than the
high 2's and also developed dependency on the therapists more easily.
Scale 4(Pd) individuals seldom stayed past seven counseling sessions
and as a group were quite resistant to therapy although they did not
often cancel their appointments. Persons who stayed the longest in
therapy were high on Scales 7(Pt) or 8(Sc) with some clients
continuing past 60 and 21 sessions respectively for these two scales. Most
of the high 9(Ma) students stayed fewer than 11 sessions and cancelled
therapy sessions frequently. Mello and Guthrie concluded that a
therapist can get some idea of what to expect from a particular client
on the basis of his MMPI profile.




The Mello and Guthrie study is interesting because it suggests
that psychological data (MMPI) may be used by clinicians to more
efficiently select clients for psychotherapy. Unfortunately, the
authors did not examine this problem within the context of a
decision-making task nor did they subject their data to multivariate analysis.

Using length of stay in psychotherapy as the predictor criterion
is valuable for other reasons. For the professional involved, it may
clarify the services offered by his agency and help him to provide
more adequate services to his clients. For example, he may decide
that seeing many clients for a short period of time is of more value
than giving those who need long-term therapy this service and thus
seeing fewer clients. That is, prevention may be emphasized in a
college mental health clinic and such a clinic may be designed to see
as many students as possible to ease their transition from high school
or junior college to a college curriculum. On the other hand, a
clinic may be more treatment oriented and seek to help those who are
more disturbed and require longer therapy. This emphasis would
require more staff time per individual client and would necessitate
seeing fewer clients. Decisions of whom to treat could be more
adequately made with test and non-test information.

To  be  able  to  predict  length  of  stay  in  therapy  could  affect 
therapist  expectations  which  could  in  turn  affect  outcome  variables. 
Just  what  effect  an  expectation  for  a  particular  length  of  stay  in 
therapy  will  have  on  the  outcome  of  the  therapy  is  outside  the  scope 
of  the  present  study  but  is  an  important  research  question  in  itself. 
Of course, if the clinician intends to see everyone who enters his
clinic, a screening procedure is worthless or may even be detrimental
if the test predicts that an individual will not stay in therapy or
will not improve in therapy, because this may lead the therapist to
expect just these results to the client's disadvantage (Meehl & Rosen,
1955).

It  is  often  necessary  for  the  clinician  to  indicate  a  therapy 
prognosis  for  an  individual.   If  the  clinician  can  predict  or  learn 
to  accurately  predict  whether  or  not  a  person  will  stay  in  therapy, 
he  is  providing  useful  information  for  the  person's  treatment. 

Thus  it  can  be  seen  that  clinicians  are  constantly  involved  in 
the  task  of  prediction  and  decision-making.   If  they  can  be  trained 
to  make  use  of  relevant  data  and  material,  they  may  improve  their 
predictions. Although many clinicians look with disfavor on the use
of tests, tests combined with other relevant data can be shown to
have practical and research applications. The clinician may use them
to better his predictions and decision-making processes.


Hypotheses  Tested 

The present study was addressed to two objectives. First, to
examine the decision-making process and to determine whether
prediction accuracy is influenced by independent variables such as clinical
experience and varying amounts and kinds of information. Second, the
question of whether clinicians can be trained to improve their
clinical decision processes was also examined. The first and primary
objective was studied in terms of the second objective, a real-life
situation that is meaningful to clinicians today--the problem of
length of stay in psychotherapy. If increments in levels of
statistical information increase prediction accuracy and thereby improve
the clinicians' decision process, this type of information may be
dovetailed into the operation of a clinic and taught to the staff to
identify high-risk individuals. Specific questions, or hypotheses,
were raised. Does judgmental accuracy increase as more information
is added to the prediction task and what types of information are
most useful in increasing judgmental accuracy? Will there be
differences in accuracy dependent on experience level? That is, will
graduate clinical psychology students trained in statistical decision
theory be better clinical judges than experienced PhD clinical
psychologists (without such training) and will less experienced
psychologists be superior to more experienced clinicians? Will
confidence and appropriateness increase with increments in information
and will there be differences between the three experience levels
with regard to their confidence and appropriateness?


METHOD 

Subjects. Twelve judges (Js) represented three levels of
experience and sophistication in statistical decision-making. A
professional (P) group of four PhD clinical psychologists represented
the highest level of clinical experience. A group of four clinical
psychology graduate students trained (sophisticated) in statistical
decision-making theory (SGS) represented the highest level of
statistical sophistication. Another group of four unsophisticated (not
trained in statistical decision theory) clinical psychology graduate
students (UGS) represented the same experience level as the SGS group
and the same level of statistical sophistication as the P group.
Sophistication in decision theory was defined as participation in a
graduate course in statistical decision theory for clinical psychology
students at the University of Florida. Sophistication here only
implies special training and by no means implies that the professionals
were clinically unsophisticated.

Materials. Test materials for Js were a random sample of 100
MMPI profiles of clients seen in a university mental health service.
The sample profiles were drawn from 241 profiles of all clients seen
during a three-year period. Each J received 25 of the 100 profiles.

Profiles were divided into two groups based on the client's
length of stay in psychotherapy at the mental health service. A short
stay (S) was defined as four or fewer therapy sessions and a long stay
(L) as five or more therapy sessions. The mean length of stay for
the S group was 2.00 sessions and for the L group 9.27 sessions.

A discriminant function analysis which maximized the difference
between the two length of stay in psychotherapy groups was run on
the 241 MMPI profiles. The mean discriminant composite scores for
the two length of stay in therapy groups on the 13 MMPI scale
variables were $\bar{Z}_1$ = 29.7^1 for the few-session group (S) and
$\bar{Z}_2$ = 34.26 for the many-session group (L). An analysis of
variance of the composite means showed a significant difference between
the two groups (F = ^^1.19, df = 12, 228, p < .001). A commonly used
rule of $Z = (\bar{Z}_1 + \bar{Z}_2)/2$ was used to determine the
optimal predictive cutting Z score.

With an emphasis on minimizing the false negative rate, the
composite Z score of 32.02 predicted with an overall hit rate of 67 per
cent for the original protocol pool. False negative errors
represented those clients who were predicted as short-stays (S), or
negatives, but who remained long in therapy (L). It was felt that
this predictive error was more serious than the false positive error
which included those clients who were predicted as long-stays (L)
but who remained a short time in therapy (S). It seemed more
important to identify those clients who really needed long-term therapy
than to identify those who did not. Of course, some of the
individuals with high test scores who stayed only a few sessions may have
been very disturbed but dropped out of therapy prematurely. There
was no way to identify these cases when a very disturbed student may
have panicked or become threatened by therapy and dropped out or
simply missed appointments.¹ The false negative rate for the Z score
of 32.02 was .38, the false positive rate was .31, giving a valid
negative rate of .69 and a valid positive rate of .62.

Another Z score, which minimized the false positive error,
predicted with an overall accuracy of 71 per cent but was not used in
the present study for the reason stated above.

Conditional probabilities were calculated for the Z cutting
score. Conditional probabilities were computed with the following
equations:

$$P(L/+) = \frac{P(L)\,P(+/L)}{P(L)\,P(+/L) + P(S)\,P(+/S)} \quad \text{and} \quad P(S/-) = \frac{P(S)\,P(-/S)}{P(S)\,P(-/S) + P(L)\,P(-/L)}$$

where: L = many therapy sessions or a long stay in therapy (base rate = .34)

S = few therapy sessions or a short stay in therapy (base rate = .66)

+ = a positive test score (Z ≥ 32.02)

- = a negative test score (Z < 32.02)

For the Z score of 32.02 the conditional probabilities were:
P(L/+) = .51 and P(S/-) = .78. With this new information it can be seen
that with a positive test score, predictions will be wrong as often
as they are correct. But given a negative test score, predictions
will be right 78 per cent, or most, of the time.
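
The reported conditional probabilities can be reproduced from the error rates given earlier for the cutting score of 32.02. The sketch below assumes base rates of .34 for long stays and .66 for short stays, the values given in the Procedure section for the profile pool; the function and variable names are introduced here for illustration only.

```python
def posterior(prior, likelihood, other_prior, other_likelihood):
    """Bayes' rule for two mutually exclusive, exhaustive groups."""
    numerator = prior * likelihood
    return numerator / (numerator + other_prior * other_likelihood)


# Assumed base rates (see lead-in) and the rates reported for Z = 32.02.
p_long, p_short = 0.34, 0.66
valid_positive = 0.62   # P(+/L)
false_negative = 0.38   # P(-/L)
valid_negative = 0.69   # P(-/S)
false_positive = 0.31   # P(+/S)

p_long_given_pos = posterior(p_long, valid_positive, p_short, false_positive)
p_short_given_neg = posterior(p_short, valid_negative, p_long, false_negative)

# Overall hit rate: correct positives plus correct negatives.
overall_hit_rate = p_long * valid_positive + p_short * valid_negative

print(round(p_long_given_pos, 2),
      round(p_short_given_neg, 2),
      round(overall_hit_rate, 2))
```

Under these assumptions the three values round to .51, .78, and .67, matching the conditional probabilities and the overall hit rate reported for this cutting score.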

Finally, a random sample of 100 profiles from the total protocol
pool of 241 cases was drawn. This was done so that the Js would have
fewer protocols to judge, making their task more economical with
regard to time.


¹Failure to control this factor undoubtedly lowered the
predictive accuracy of the discriminant function equation (and perhaps
clinical judgment) in that some of the disturbed profiles in the (S)
criterion group may well have remained (L) if they had not dropped out.

A second reason for drawing a random sample was to make the
situation more relevant clinically in terms of the base rates. That
is, the sample had only approximate base rates and the judges did not
know the exact probabilities for their sample of those who remained a
long or short time in therapy. However, for the sample, the Z score
predicted with the same accuracy that it did for the total protocol
pool.

Procedure. Refer to Table 2 for a schematic of the design. Js
were asked to predict a client's length of stay in psychotherapy
from the 25 MMPI profiles. These profiles, the sample of 100
profiles, and the original profile pool all had approximately the same
base rates: 34 per cent of the clients stayed many sessions (L) and
66 per cent stayed a few sessions (S). The Js predicted length of
stay in therapy (S or L) during four sessions, with additional
information added incrementally at each session. These sessions, or
levels of information, represented one class of independent variables.
Groups, or experience level, represented the other class of
independent variables.

Each J made his predictions on the same 25 protocols that he
received at the first level throughout the training. Level I: Js
were first given MMPI profiles with no other information. Level II:
Js were again presented the same 25 protocols for the same judgment
but with the additional information of biographical data such as age,
sex, marital status, religious preference, parents' marital status,
previous counseling experience, and subsequent counseling
experience. Level III: For the third decision task, Js were given the
profiles and biographical data, with the additional statistical
information of the cutting score based on discriminant function analysis.
Valid positive and false positive percentages were also provided
with the cut-off Z score. Level IV: Conditional probabilities and
the base rates were added to the previous information for the fourth
presentation of profiles for prediction. (For a copy of the
instructions for each information level see Appendix A.) For each judgment
Js also indicated their confidence in the accuracy of their
judgment.

To rule out a practice effect from repeated presentation of the
same profiles, two control judges were used who predicted length of
stay in psychotherapy using profiles only, with no additional
information, on four separate occasions.

Judges were presented the profiles for judgments on four days
in a row with only one information level given each day.

Hypotheses. I--Information Level: It was hypothesized that
increments of information would increase overall judgmental accuracy
and group accuracy. (A) Level I accuracy would be at approximately
the level of chance. (B) At Level II, accuracy would decrease or
remain the same. (C) Level III accuracy would be approximately that
of the actuarial prediction accuracy of the discriminant function
analysis. (D) Level IV accuracy would increase slightly over Level
III accuracy.

II--Experience Level: It was hypothesized that the statistically
sophisticated graduate students would be the most accurate, the
statistically unsophisticated graduate students next most accurate,
and the professionals least accurate.

III--Confidence and Appropriateness: It was hypothesized that
confidence ratings would increase with increments of information and
that appropriateness would also increase with more information.


RESULTS 

Accuracy:   The  Effects  of  Information  and  Experience 

Accuracy was defined as the proportion of correct judgments per
presentation of 25 MMPI profiles. The two control Js showed no
practice effects. Judge A's accuracy was 52 per cent on the first
presentation and 48 per cent on each of the three subsequent
presentations. Judge B's accuracy was distributed across sessions as
follows: 76 per cent, 48 per cent, 68 per cent, and 68 per cent.
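
As a minimal sketch of the accuracy measure defined above, the snippet below scores one presentation of 25 profiles. The outcome and prediction lists are hypothetical, constructed only so that 13 of 25 judgments are correct (the 52 per cent reported for Judge A's first presentation); the function name is illustrative.

```python
def accuracy(predictions, outcomes):
    """Proportion of judgments that match the criterion (S or L)."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must be the same length")
    hits = sum(p == o for p, o in zip(predictions, outcomes))
    return hits / len(outcomes)


# Hypothetical criterion for 25 profiles: 16 short stays, 9 long stays.
outcomes = ["S"] * 16 + ["L"] * 9
# Hypothetical judgments: 10 hits and 6 misses on the short stays,
# then 3 hits and 6 misses on the long stays (13 hits in all).
predictions = ["S"] * 10 + ["L"] * 6 + ["L"] * 3 + ["S"] * 6

print(accuracy(predictions, outcomes))
```

With only 25 profiles per presentation, each judgment is worth four percentage points, so small differences between sessions reflect one or two profiles.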

Table 3 presents accuracy by information level and experience
level. Two analyses of variance were conducted to determine the
effects of information level, experience level, judges, profile set,
and profile. The analysis of variance for profile set effects was
non-significant (F=.53, df=3,91). An Fmax test for homogeneity of
variances between groups was also non-significant (Fmax value
illegible in the source, df=16). A summary of the analysis of variance
for information level and experience level effects is shown in Table 4.

Information. Mean judgmental accuracy increased consistently
with increments of information from Level I to Level IV (X̄_I=.55,
X̄_II=.61, X̄_III=.67, X̄_IV=.69). These differences were significant
(F=10.82, df=3,27, p<.01). A graphic presentation of this trend
is shown in Figure 1. Inspection of Figure 1 shows approximately a
linear increase in accuracy for the three groups by information
level. Both the P and UGS groups increased their accuracy at each
level while the SGS group showed increases at Levels II and III
Table 3
Proportion of Correct Judgments

Experience              Information Level
Level         J        I      II     III    IV     Totals

SGS           1       .64    .68    .76    .76     .71
              2       .56    .68    .72    .64     .65
              3       .64    .68    .72    .72     .69
              4       .60    .60    .72    .68     .65
            Total     .61    .66    .73    .70     .68

UGS           5       .72    .64    .64    .76     .69
              6       .40    .64    .72    .68     .61
              7       .36    .52    .48    .56     .48
              8       .36    .48    .68    .64     .54
            Total     .46    --     .63    .66     .58

P             9       .44    .48    .68    .72     .58
             10       .68    .68    .64    .76     .69
             11       .64    .68    .68    .72     .68
             12       .56    .60    .64    .60     .60
            Total     .58    .61    .66    .70     .64

Totals                .55    .61    .67    .69

Control       A       .52    .48    .48    .48     .49
Control       B       .76    .48    .68    .68     .65
Table 4
Summary of Analysis of Variance of Accuracy

Source of Variation          Mean
                             9        0.39     0.69
                             27       0.11
Profile                      0.57
Information X Profile        864      0.11

** Significant at the .01 level
but a slight decrease in accuracy from Level III to Level IV.

Experience. There were no differences in accuracy due to
experience level except for a trend toward group differences (F=2.34,
df=2,9, p<.20). The SGS group was the most accurate and the UGS group
the least accurate (X̄_SGS=.68, X̄_P=.64, X̄_UGS=.58). Only the SGS
group's overall accuracy was at the level of the discriminant
function, which predicted with 67 per cent accuracy.

Information and experience level interaction. The only other
significant source of variance was the group by information level
interaction (F=7.23, df=6,27, p<.01). The Newman-Keuls test of
differences between means was used (Kirk, 1968) and the results of
this analysis are given in Appendix B. The interaction was based
largely on a significantly lower proportion of correct judgments of
the UGS group at Level I. The UGS group not only started with the
lowest proportion of correct judgments, but also showed the most
significant increase in accuracy as information was added. Their
final degree of accuracy, however, was approximately the same as the
SGS accuracy at Level I. The UGS group significantly increased
their level of accuracy at Levels III and IV from Level I when the
composite Z score, conditional probabilities, and base rates were
added (p<.01).

The only significant increase in accuracy for the P group was
between the first level, with the profile only, and the final level
with all information (p<.05). Increases in accuracy for all groups
combined across information levels were significant except the
increase from Level III to Level IV, where conditional probabilities
and base rates were added. Adding conditional probabilities and base
rates to the previous information did not result in a significant
increase in Js' accuracy over Level III, which included the composite
Z score. For the SGS group there were no significant differences in
accuracy across information levels. The only significant group
differences within information levels were between the SGS group and
the UGS group (p<.01) and between the SGS group and the P group
(p<.05).

Confidence

Mean confidence scores by information level and experience
level are shown in Table 5. Table 6 shows the summary of the
analysis of variance for confidence scores. The Js' confidence
increased significantly as subsequent items of information were added
to the protocols for all groups (F=5.38, df=2,28, p<.05). Although
there were no differences between confidence scores for groups, the
P group tended to be the most confident and the SGS group tended to
be the least confident (X̄_P=76.86, X̄_UGS=72.72, X̄_SGS=69.96). These
trends are shown in Figure 2.

Appropriateness 

A measure of appropriateness (confidence weighted by accuracy)
was obtained from Pearson product-moment correlations between
confidence scores and accuracy scores for each J at each level of
information. There were no significant group effects or
interactions but the SGS group tended to be the most appropriate
(r_SGS=.26, r_P=.20, r_UGS=.17). The higher the correlation, the more
appropriate the judgment. Correlation coefficients for appropriateness are
Table 5
Mean Confidence Scores

Experience              Information Level
Level         J        I       II      III     IV      Totals

SGS           1       61.2    64.0    65.6    65.6     64.1
              2       61.2    60.8    60.8    72.0     63.7
              3       75.6    74.0    73.6    74.4     74.4
              4       72.0    76.2    80.4    82.0     77.7
            Total     67.5    68.8    70.1    73.5     70.0

UGS           5       59.0    58.0    66.0    69.2     63.1
              6       85.4    86.3    84.0    92.2     87.1
              7       78.2    79.8    76.2    82.2     79.1
              8       64.2    61.6    61.6    62.0     62.4
            Total     71.7    71.6    71.5    76.4     72.7

P             9       87.6    87.2    88.4    90.0     88.3
             10       64.8    56.0    65.8    56.0     60.7
             11       85.2    87.8    86.4    86.6     86.5
             12       68.0    70.4    76.8    72.8     72.0
            Total     75.2    75.4    79.4    76.4     76.9

Totals                71.4    71.9    73.8    75.4
Table 6
Summary of Analysis of Variance of Confidence

Source of Variation           df      MS        F

Information Level              2      52.67    5.38*
Experience Level               2     191.84     .39
Information X Experience       6      12.73    1.30
Judges within Groups           9     493.76
Information X Judges          28       9.79

* Significant at the .05 level.


[Figure: confidence plotted against information level for each group]
Fig. 2. Confidence by information levels.
given in Table 7 with the analysis of variance summary for
appropriateness in Table 8. The analysis of variance was based on z
transformations of the correlation coefficients. Mean
appropriateness scores were significantly higher at each level of
information (F=22.03, df=2,28, p<.01).
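
Correlation coefficients are usually transformed before they are averaged or entered into an analysis of variance, because their sampling distribution is skewed. The sketch below assumes the transformation was Fisher's r-to-z, which the text does not name explicitly; the function names are illustrative, and the input values are the four UGS judges' overall correlations as read from Table 7.

```python
import math


def fisher_z(r):
    """Fisher's r-to-z transformation (the inverse hyperbolic tangent)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def mean_correlation(rs):
    """Average correlations on the z scale, then transform back to r."""
    zs = [fisher_z(r) for r in rs]
    return math.tanh(sum(zs) / len(zs))


# Overall appropriateness correlations for the four UGS judges (Table 7).
ugs_rs = [0.07, 0.10, 0.17, 0.34]
print([round(fisher_z(r), 3) for r in ugs_rs])
print(round(mean_correlation(ugs_rs), 3))
```

For correlations this small the transformed values differ little from the raw ones, so the choice of scale is unlikely to change the pattern of results reported here.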

The SGS group was most appropriate because they were most
accurate and not overconfident, that is, their confidence was
consistent with their accuracy. The P group was overly confident and
the UGS group was less accurate, making these two groups' confidence
inconsistent with their accuracy. These trends can be seen in
Figure 3.

Judges  versus  the  discriminant  function 

The discriminant function correctly classified 67 per cent of
the profiles. This information was given to the Js at Level III.
At Level III only the SGS judges were more accurate than the
discriminant function, with a hit rate of 73 per cent. The P group had a
hit rate of 66 per cent and the UGS group had a hit rate of 63 per
cent at Level III. The accuracy for all judges combined was 67 per
cent. Five judges (two in the SGS group, two in the P group and one
in the UGS group) were more accurate overall than the linear
regression Z score and only one J (in the UGS group) operated below the
chance level overall.

At Level I, four Js had accuracy scores below the level of
chance and two others were only slightly above chance. However,
none of the four Js who were below chance was in the SGS group. At
Level II there were two Js below chance and one slightly above; again,


Table 7
Correlation Coefficients for Appropriateness

Experience              Information Level
Level         J        I      II     III    IV     Totals

SGS           1       .35    .35    .34    .28     .33
              2       .25    .20    .18    .65     .32
              3       .34    .31    .4?    .28     .34
              4       .11    .06    .01    .02     .05
            Total     .26    .23    .24    --      .26

UGS           5       .13   -.06    .24   -.03     .07
              6      -.03   -.15    .10    .47     .10
              7      -.05   -.03    .30    .47     .17
              8       .29    .37    .41    .27     .34
            Total     .09    .04    .26    .30     .17

P             9       .27    .20    .17    .33     --
             10       .06    .23    .33    .31     --
             11       .11    .19    .29    .28     --
             12       .02   -.07    .08    .41     --
            Total     --     .14    --     .33     .20

Totals                --     --     .24    .31
Table 8
Summary of Analysis of Variance of Appropriateness Correlations

Source of Variation           df      MS        F

Information Level              2     .0705    22.03**
Experience Level               2     .0366     2.26
Information X Experience       6     .0072     2.25
Judges within Groups           9     .0162
Information X Judges          28     .0032

** Significant at the .01 level.
Fig. 3. Appropriateness correlations between accuracy and
confidence by information levels.
none of these Js was in the SGS group. With the addition of the
statistical information, only one J (in the UGS group) remained near
the chance level of accuracy and he was the least accurate of all
the Js.


DISCUSSION 

The present study demonstrated that judges can substantially
improve their decision accuracy when provided with increments of
information, particularly statistical information. This finding
extends the earlier findings reported for a different clinical
judgment task (Shagoury & Satz, 1969) and contrasts with previous studies
which have used non-quantitative data. These findings also suggest
that if the clinician is able to incorporate quantitative information,
he may improve his own decision-making ability and equal or surpass
the accuracy of actuarial methods.

The findings of the present study also showed that accuracy
increased directly as a function of the amount of information
available to the judges. Two conclusions that can be drawn from this
finding are that the information was relevant to the judgmental task
and that the judges used this information in formulating their
decisions.


Information 

A post-testing interview revealed that the type of information
used varied between groups, among judges, and between information
levels. However, the interview was not structured enough to
determine the actual decision rules used by the judges.

At Level I, with the MMPI profile only, most judges used their
own intuition about the relationships of which scales were elevated
and the extent of these elevations to the length of stay in
treatment criterion. There was a great deal of individual variation in
approach since each judge had different training experiences with
the MMPI. The judges of the SGS group had the most similar training
experience in the use of the MMPI since some training in the
rationale and use of this instrument was given in the statistical decision
theory course. The SGS group also showed the least amount of
individual variation in accuracy at Level I. The other group of graduate
students (UGS) had the least amount of exposure to the MMPI. The
UGS group was barely familiar with this test instrument and none of
these judges had had any formal training in its use. It is
interesting to note that the group of unsophisticated graduate students
showed the lowest accuracy throughout and was the only group whose
accuracy was ever below the level of chance. It seems, then, that
the more familiar a judge is with a test instrument, the more
accurate he will be in using it for prediction.

The SGS group was not familiar with the specific type of task
used in the present study. That is, they had not been trained in
correlating MMPI data to length of stay in psychotherapy. This
aspect of the study was novel to each of the three groups.

At Level II again each judge approached the data differently
and selected certain measures to use in making his decisions. The
variation within groups decreased and there was less difference in
this variation between groups. Accuracy increased or stayed the
same for all but one judge whose accuracy dropped. Thus, the
hypothesis that accuracy would decrease at Level II was not supported.
It was originally felt that all of the biographical data provided
would make the task more complex and more difficult and would thus
confuse the judges. However, the judges were able to relate some of
the information to the task and thereby improve their judgments.
Most judges used some combination of factors. Whether the profile
subject had previous counseling or subsequent counseling and his age
were the factors used most often. Some of the judges also considered
marital status when the subject was married. This finding (Level II)
contrasts with other studies which indicate lowering of accuracy
when data are combined (Golden, 1964).

In support of the hypothesis, overall accuracy for Level III
was the same as the discriminant function's accuracy of 67 per cent.
With the addition of Z scores at Level III, only one judge, who was
in the P group, used the cut-off score exclusively. In this same
group one judge changed none of his judgments from Level II and the
other two judges used primarily their own subjective inferences. The
UGS group essentially ignored the Z scores and relied on their own
intuition and thus did not reach the level of accuracy of the
cut-off score. All judges in the SGS group combined the cut-off score
data with their own intuition to improve upon the accuracy of the
discriminant function. These findings imply that the clinician can
make use of his intuition and experience but not at the expense of
ignoring available data, particularly when they include quantitative
information. The findings also imply that the most accurate judges
are the ones who are able to utilize statistical data.
The fact that there was an increase in accuracy from Level III
to Level IV, but that this increase was non-significant, supported the
hypothesis that Level IV accuracy would increase slightly over Level
III accuracy. For the SGS group there was a slight decrease in
accuracy. One reason for this decrease might have been the
information itself. These judges had been trained to use more powerful
statistical information, that is, data that discriminated groups and
sub-groups more than did the data of the present study. The base
rates of .65 and .35 were not sufficiently different from base rates
of .5 to be of much help. Also the conditional probabilities were
not high enough to provide maximum discrimination. All of the
statistical data given were in approximately a 2/3 to 1/3 ratio.
Because all of the statistical data had approximately the same
predictive power, it may have been difficult to know which kind of
information would be most useful. Instead, judges may have tried to
combine two or more kinds of data and as a result were less accurate
than they would have been using one type exclusively. Quantitative
information is most useful when it represents higher ratios, such as
base rates of .2/.8 or .1/.9; conditional probabilities of .85/.15;
and cut-off scores of 75 per cent or higher.
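
The point about base rates can be made concrete with a small calculation. A judge who ignores the profile and always predicts the more frequent outcome is correct at a rate equal to the larger base rate. The sketch below is illustrative only; the function name is hypothetical, and the .50 reference value is the chance level used throughout the study.

```python
def majority_rule_accuracy(p_short):
    """Accuracy of always predicting the more frequent outcome."""
    if not 0.0 <= p_short <= 1.0:
        raise ValueError("p_short must be a probability")
    return max(p_short, 1.0 - p_short)


# Compare the study's approximate base rate (.65) with the more
# extreme base rates mentioned in the text (.8 and .9).
for p_short in (0.50, 0.65, 0.80, 0.90):
    acc = majority_rule_accuracy(p_short)
    print(f"base rate {p_short:.2f}: accuracy {acc:.2f}, "
          f"gain over chance {acc - 0.50:.2f}")
```

With base rates of .65 and .35 this rule is correct about 65 per cent of the time, nearly the 67 per cent achieved by the discriminant function, which is consistent with the observation that the several kinds of statistical data carried about the same predictive power.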

Even though five out of the twelve judges showed decreases in
accuracy at Level IV in comparison with Level III, these decreases
were slight and represented only one more incorrect judgment out of
the 25 judgments for all five judges.

At Level IV, the UGS group ignored the base rates and used the
conditional probabilities. The SGS group and the P group both used
a combination of conditional probabilities and base rates and both
of these groups had the same degree of accuracy at Level IV. Also,
both the SGS and P groups were more accurate than the UGS group.

It seems probable that statistical information is more
important than biographical information about the subjects since there
was a greater increase in accuracy with the addition of statistical
information. Other studies have shown that biographical data are of
minimal value to judges. Golden (1964) found that judges agreed
less in their description of protocols when they were given
identifying data alone than when they were given a single psychological
test or a combination of tests. Kostlan (1954) found that judges
were more accurate in their psychodiagnoses when they received both
social case histories and the more quantitative MMPI protocol than
when they received social case histories alone.

One may ask if the judges would have been more accurate had they
been given some feedback on their accuracy at each level of
information. This is possible but then the task would not have been as
life-like in the sense that clinicians in actual situations must
usually wait some time before learning the accuracy of their
predictions. However, this does emphasize the point that clinicians should
check the accuracy of their predictions when possible and learn what
helps them to predict most accurately.

Experience and training

The lack of overall differences between groups was not
anticipated. It was assumed that the SGS group would have benefited
from their training in statistical decision theory. However,
artifacts in the design tended to wash out group effects by providing
a guaranteed hit rate if the Z scores were used at Levels III and IV.
The convergence of judgmental accuracy for each group at Level IV
lends some support for this argument.

The hypothesis that the SGS group would be the most accurate
was thus only tentatively supported since there was not a
significant group effect. However, the SGS group tended to be the most
accurate in their judgments. This finding implies that clinicians
can be trained to improve their own subjective inferences with
statistical information. These judges trained in statistical
decision theory were able to add their own intuitive judgments to the
statistical data and thereby predict more accurately than did the
discriminant function alone or than they had done without the
statistical data. This special training taught them not only how to use
statistical information but also how to use their clinical intuition
to its best advantage. The SGS group also tended to be the most
appropriate, that is, to know when their judgments were most accurate
and when they were most inaccurate.

The fact that the P group tended to be more accurate than the
UGS group was also unexpected in light of previous findings
concerning amount of clinical training and accuracy of prediction.
This finding does not support the previous evidence (Goldberg, 1959;
Oskamp, 1962; Shagoury, 1969; Shagoury & Satz, 1969) that as the
amount of clinical experience increases, prediction accuracy
decreases. In the present study, judges in the SGS and P groups
used familiar methods, the MMPI profiles or statistical data; the
UGS group, by contrast, was presented with essentially unfamiliar
prediction tools. One reason data in the present study were at
variance with previous findings is that previous studies required
clinicians to predict an unknown criterion or to use unfamiliar
methods so that any previous "set" of the clinician was not
advantageous. In the present study, the judges' familiarity with either the
MMPI or statistical types of data helped them in their predictions.

Interaction  effects 

The significant group by information interaction effect showed
at least indirect support for the experience level hypothesis that
the SGS group would be most accurate. This interaction showed that
the unsophisticated graduate students started off predicting below
chance and, finally at Level IV, reached the level of accuracy that
the sophisticated graduate students attained at Level I (MMPI
profiles alone). It was the former group with the least amount of
experience, familiarity with the MMPI, and sophistication with the
statistical decision theory which accounted for most of the group
differences and much of the interaction effect.

The rest of the interaction effect was due to the changes in
accuracy across information levels with the UGS group showing the
greatest change and the SGS group showing the least amount of
change in accuracy. The latter group started out predicting fairly
accurately and had less room for improvement while the former group
started out so poorly that their improvement was marked. The SGS
group predicted almost as well as the discriminant function with
the profiles only. The UGS group improved from below chance to the
level of accuracy achieved by the discriminant function.

Confidence

Previous studies would suggest that the trained clinicians
should have had less confidence in their judgments than the two
pre-professional groups. Although group differences for confidence were
non-significant, the professionals in the present study tended to be
the most confident. Again, this may have been because they were
using the MMPI with which they were more familiar than were the other
two groups. Also the professional group was predicting a criterion
about which they knew something, that is, length of stay in
psychotherapy. This again suggests that previous studies have placed the
clinician at a disadvantage so that he is less accurate and less
confident than he would be predicting in a familiar setting.

In general, adding information substantially increased the
judges' confidence. Judges became more confident as well as more
accurate with increments in information. However, the UGS group's
confidence did not increase until they had all the available
information.

One problem with asking judges to assign a confidence rating to
each judgment was that each judge had a different standard or set
for measuring how confident he was. The range of confidence scores
used also varied between judges and within groups so that one judge
used all six possible levels of confidence ranking (50, 60, 70, 80,
90, and 100) while another judge only used two (60, 70) or three
(80, 90, 100) rankings.
Appropriateness

The most meaningful measure to express appropriateness, defined
as the relationship between accuracy and confidence, was the
correlation coefficient. Just as accuracy and confidence increased with
each level of information, so did appropriateness. As judges became
more accurate they also became appropriately more confident.
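
A minimal sketch of the appropriateness measure for one judge at one information level follows. It computes the Pearson product-moment correlation between confidence ratings and judgments scored 1 (correct) or 0 (incorrect). The ratings and outcomes shown are hypothetical, not data from the study, and the function name is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A judge who gives one rating throughout, or who is always
        # right or always wrong, has no defined correlation.
        raise ValueError("correlation undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical judge: confidence on the 50-100 scale and whether each
# judgment was correct (1) or incorrect (0).
confidence = [50, 60, 70, 80, 90, 100, 60, 70]
correct = [0, 0, 1, 1, 1, 1, 1, 0]
print(round(pearson_r(confidence, correct), 2))
```

A positive value indicates that the judge was more confident on the judgments he got right. A judge who used a single confidence rating throughout would have no defined correlation, which bears on the restricted rating ranges some judges used.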

The increases in appropriateness followed the same pattern as
the increases in accuracy and confidence. That is, appropriateness
increased significantly across levels of information but there was
only a tendency for one group to be more appropriate than the other
groups. As with accuracy, the SGS group tended to be the most
appropriate and the UGS group tended to be the least appropriate. This
contrasts with earlier findings that trained clinicians are more
appropriate in their confidence levels than are graduate students in
psychology (Oskamp, 1962; Shagoury, 1969). The findings of the
present study, however, do not contradict earlier findings since the
present differences between groups on the measure of appropriateness
were non-significant.

Applications

It appears that actuarial data and training in their use can be
applied to situations in which clinicians must predict and make
decisions. In the present study, judges were able to post-dict length
of stay in psychotherapy fairly well. The next step would be to
apply these techniques to the same setting and predict a client's
length of stay in psychotherapy. This could then be followed up at
the end of treatment as a check of prediction accuracy. This would
enable the clinician to determine which short stays were "no shows"
and which were treated. Thus the accuracy of the discriminant
function Z score and of the judges' predictions could be much higher,
and both more useful for practical application to the clinician's
population of clients. This type of procedure is most useful in a
clinic situation which must limit the number of clients seen or must
screen those that will be seen. Statistical methods of prediction can
be particularly applicable to the screening of patients to determine
what type of treatment is most appropriate and would be most useful
for each client.

To use actuarial data in a clinic situation, they must first be
collected and analyzed. Too many clinical situations today fail to
make use of the data they have available. They do not even know the
base rates for various classifications of the clients they see.
Collecting and analyzing statistical data is another way to more
fully understand a particular clinical setting by learning what type
of patients are seen, how long they stay in treatment, and hopefully,
which ones are most likely to improve.

If a clinic decides to see everyone who comes in for help, tests
and statistical data are not of benefit in selecting whom to see.
However, these data might be used for prediction and research in a
setting which sees all clients. It is in situations where everyone
cannot be treated that improving tests and collecting base rate
information is most needed. Where decisions and predictions must be
made, actuarial methods are most needed to improve the clinician's
decision-making ability.

A further study which would be a fair and optimal test of clinical
versus statistical prediction would be to give judges an
opportunity to see the relationships of test variables with a criterion
on a standardization sample. Then, the judges would be compared
with a discriminant function on a cross validation sample. However,
this was not the purpose of the present study.


SUMMARY 

The  present  study  was  designed  to  look  at  the  effects  of  adding 
quantitative  and  qualitative  data  to  a  relevant  clinical  judgment 
task. In essence, it compared judges with varying degrees of clinical
experience to actuarial prediction methods. The study also
attempted  to  train  judges  to  use  actuarial  information  to  improve 
their  prediction  accuracy. 

Twelve judges representing three levels of clinical experience
made post-dictive judgments on the length of stay in psychotherapy
(short or long) from a sample of MMPI profiles of clients seen in a
university mental health service. Judgments were made under four
conditions in which qualitative and quantitative information was
added incrementally at each level. The three levels of judges'
experience were professional clinical psychologists, "sophisticated"
third year clinical psychology graduate students trained in
statistical decision theory, and "unsophisticated" third year clinical
psychology graduate students without any training in statistical
decision theory.

Accuracy increased over levels of information but there were no
differences in accuracy for the three levels of experience. A
significant group by information level interaction demonstrated some
group effects due to a lower proportion of correct judgments for the
less experienced judges under conditions involving the least amount
of information.

Judges became more confident in their judgments as they received
more information. Appropriateness, defined as accuracy weighted by
confidence and measured by correlation coefficients between accuracy
and confidence, increased substantially as increments of information
were added. The group trained in statistical decision theory tended
to make the most appropriate judgments and the least experienced
group of graduate students tended to make the least appropriate
judgments.

The  present  study  showed  that  clinicians  can  use  quantitative 
data  to  improve  their  own  judgmental  ability  and  to  predict  more 
accurately than actuarial data alone. Also, since those judges with
the  most  experience  in  using  actuarial  tasks  tend  to  be  the  most 
appropriate  in  their  judgments,  this  implies  that  clinicians  can  also 
be  trained  to  be  more  appropriate  and  to  know  when  their  judgments 
are  more  likely  to  be  accurate. 


APPENDIX A
INSTRUCTIONS

APPENDIX A-I

INSTRUCTIONS - PART I


This study is designed to examine the decision process when only
limited information is available. You will be presented with 25
Minnesota Multiphasic Personality Inventory (MMPI) profiles of
students seen at the University of Florida Infirmary Mental Health
Service. Some of these students stayed a long time in therapy (5 or
more sessions, mean = 9) and some stayed only a short time (4 or less
sessions, mean = 2). Your task will be to decide which students stayed
a long time (L) and which stayed only a short time (S) on the basis
of the test profile alone.

Your  task  is  to  try  to  make  the  best  estimate  of  probable  length 
of stay in psychotherapy given only limited information. It is
possible to correctly classify all the profiles. It is hoped that your
predictions will in some way help us to understand one aspect of the
decision-making process as it is applied by psychologists in clinical
settings.

You  will  also  be  asked  to  rate  your  confidence  for  each  subject 
on  a  scale  from  50  per  cent  to  100  per  cent.   If  you  are  positive  of 
your  decision,  you  should  mark  100  per  cent;  if  you  are  only  guessing 
you should mark 50 per cent. That is, the more certain of your
decision, the higher percentage you should mark.

APPENDIX A-II
INSTRUCTIONS - PART II

Your  task  on  Part  II  is  identical  to  that  on  Part  I.   You  will 
be  presented  the  same  25  profiles  and  asked  to  predict  (S)  or  (L). 
However, this time more information will be available to you. That
is, you will also have biographical data. You may use this
information in any way you wish. You may choose to disregard the
information altogether and make your predictions as you did in Part I.

Your task is to try to make the best estimate of probable length
of stay in therapy given only limited information. It is possible to
correctly classify all the profiles. It is hoped that your
predictions will in some way help us to understand one aspect of the
decision-making process as it is applied by psychologists in clinical
settings.

Again,  please  indicate  your  confidence  in  your  judgment  for  each 
subject from 50 per cent to 100 per cent.

APPENDIX A-III
INSTRUCTIONS - PART III

Your  task  on  Part  III  is  identical  to  that  of  Parts  I  and  II. 
You will be presented the same 25 profiles and asked to predict as
accurately as possible, on the basis of the information given,
whether the student is (S) or (L). Again, more information will be
made available to you. The following statistical information will
be added.

Discriminant function analysis provided weights for each of the
13 MMPI scale variables in order to obtain maximal differentiation
between long stayers (L) and short stayers (S). A composite score
(Z) was obtained which best estimates the combined relative effects
of all the scale variables.

This Z score is used to make the best prediction as to which
criterion group a particular profile belongs. This can be summarized
as follows:

1. Z ≥ 32.02 is a positive test sign (+) and indicates a probable
long stay in therapy (L).

2. Z < 32.02 is a negative test sign (-) and indicates a probable
short stay in therapy (S).
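
The decision rule described above can be sketched as follows. This is only an illustration: the discriminant weights are not reported in these instructions, so the weights and scale scores below are hypothetical, only three of the 13 scales are shown, and treating a score exactly at the cut-off as a positive sign is an assumption.

```python
CUTOFF = 32.02  # composite cut-off quoted in the instructions


def composite_z(scores, weights):
    """Weighted sum of scale scores: Z = sum of w_i * x_i."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same scales")
    return sum(weights[scale] * scores[scale] for scale in scores)


def classify(z, cutoff=CUTOFF):
    """Return (test sign, predicted group) for a composite score."""
    # Assumption: a score exactly at the cut-off counts as positive.
    return ("+", "L") if z >= cutoff else ("-", "S")


# Hypothetical weights and scale scores, for illustration only.
weights = {"D": 0.20, "Pt": 0.15, "Si": 0.10}
scores = {"D": 70, "Pt": 65, "Si": 60}

z = composite_z(scores, weights)
print(z, classify(z))
```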

No test, however, classifies without some errors. This derived
composite cut-off (Z = 32.02) yields the following percentages of
classification:

                  Composite test sign
Criterion     Z < 32.02 (-)     Z ≥ 32.02 (+)

    S              69%               31%
    L              38%               62%

In other words:

1. A (+) test sign (Z ≥ 32.02) correctly classified 62 per cent
of the long stayers (L). This is known as the valid positive
rate. Also, a (+) test sign incorrectly classified 31 per
cent of the short stayers (S) and this is the false positive
rate.

2. A (-) test sign (Z < 32.02) correctly classified 69 per cent
of (S), the valid negative rate, and incorrectly classified
38 per cent of (L), the false negative rate.

This means that 38 per cent of the (L)'s scored below 32.02 and were
incorrectly classified (S), and 31 per cent of the (S)'s scored above
32.02 and were incorrectly classified as (L). The total percentage
correctly classified was 67 per cent.

You will be required to predict as accurately as possible
whether the student belongs to (S) or (L), short or long stay. It is
possible to score every profile correctly, scoring 100 per cent.

You may predict (S) or (L) by using (1) the composite Z score
cut-off, (2) the biographical data, (3) the profile alone, or (4) any
combination of (1), (2), and (3). The composite cut-off score was
applied to yield the best overall classification rate but no test
is perfect and errors may be made with any procedure. It is quite
possible that the clinician may be able to improve upon the linear
statistical method (Z score) by utilizing combinations of both
"intuitive" and statistical data.

Your task is to try to make the best estimate of probable length
of stay in therapy given additional, but limited, information. It is
possible to correctly classify all the profiles. It is hoped that
your predictions will in some way help us to understand one aspect
of the decision-making process as it is applied by psychologists in
clinical settings.

APPENDIX A-IV

INSTRUCTIONS - PART IV


Your  task  on  Part  IV  is  identical  to  that  of  Parts  I,  II,  and 
III, utilizing the same 25 profiles. You are to predict as
accurately as possible on the basis of the information given, whether
the student is (S) or (L). Again, more information will be made
available to you. In addition to the composite Z score, biographical
data, and test data, you will also be told the conditional
probabilities and base rates for the groups and test signs.

Conditional probabilities combine test signs, (+) or (-), and
base rates to yield a quantitative index of the probability of
correct classification when Z ≥ 32.02 (+) or when Z < 32.02 (-).

For example, some of the subjects will be (L) when Z ≥ 32.02 (+)
and some will be (S) when Z < 32.02 (-). The problem is to determine
how confident we can be with each test sign under the base rates of
the population. The base rates for the two groups are: Short (S) =
66 per cent and Long (L) = 34 per cent. In other words, 34 per cent
of the subjects stayed a long time in therapy and 66 per cent stayed
only a short time. The majority, therefore, were shorts (S).

Based on this information, the conditional probabilities are:
for a (+) test sign, P(L/+) = .51 and for a (-) test sign, P(S/-) = .78.
This means that the probability of a person staying a long time in
therapy (L), given a positive test sign, is .51, and the probability
of a person staying a short time (S), given a negative test sign, is
.78.

A conditional probability of .51 for a (+) test sign means that
you would be as often wrong as you were correct in predicting (L) for
a (+) sign. A conditional probability of .78 for a (-) test sign
means that you would be correct more often than you would be wrong
in predicting (S) for a (-) test sign.

Your  task  is  to  try  to  make  the  best  estimate  of  probable  length 
of  stay  in  psychotherapy  given  additional,  but  limited,  information. 
It  is  possible  to  correctly  classify  all  the  profiles.   It  is  hoped 
that  your  predictions  will  in  some  way  help  us  to  understand  one 
aspect of the decision-making process as it is applied by
psychologists in clinical settings.

Again, please indicate your confidence in your judgment for each
subject from 50 per cent to 100 per cent.
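
The conditional probabilities quoted in the Part IV instructions, and the total percentage correctly classified quoted in Part III, appear to follow from the classification rates and base rates by Bayes' rule. The sketch below reproduces that arithmetic; the variable names are illustrative and not taken from the study.

```python
# Classification rates from the Part III table and base rates from Part IV.
p_pos_given_L = 0.62  # valid positive rate
p_pos_given_S = 0.31  # false positive rate
p_neg_given_S = 0.69  # valid negative rate
p_neg_given_L = 0.38  # false negative rate
p_L = 0.34            # base rate of long stayers
p_S = 0.66            # base rate of short stayers

# Overall probability of each test sign.
p_pos = p_pos_given_L * p_L + p_pos_given_S * p_S
p_neg = p_neg_given_S * p_S + p_neg_given_L * p_L

# Bayes' rule: probability of the criterion group given the test sign.
p_L_given_pos = p_pos_given_L * p_L / p_pos
p_S_given_neg = p_neg_given_S * p_S / p_neg

# Proportion of all profiles the cut-off classifies correctly.
overall_correct = p_pos_given_L * p_L + p_neg_given_S * p_S

print(f"P(L/+) = {p_L_given_pos:.2f}")
print(f"P(S/-) = {p_S_given_neg:.2f}")
print(f"total correct = {overall_correct:.2f}")
```

Rounded to two places, these reproduce the .51, .78, and 67 per cent given in the instructions.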


APPENDIX B
SUMMARY OF NEWMAN-KEULS TESTS

APPENDIX B-I
SUMMARY OF NEWMAN-KEULS TEST FOR GROUP MEAN DIFFERENCES

Differences among Level I means
[Table illegible in source; the only recoverable entry is a P group mean of .58.]

Differences among Level II means
[Table illegible in source.]

Differences among Level III means
[Table illegible in source.]

Differences among Level IV means
[Table illegible in source.]

APPENDIX B-II
SUMMARY OF NEWMAN-KEULS TEST FOR INFORMATION MEAN DIFFERENCES

Differences among SGS means
[Table illegible in source.]

Differences among UGS means
[Table illegible in source.]

the Dean of the College of Arts and Sciences and to the Graduate
Council, and was approved as partial fulfillment of the
requirements for the degree of Doctor of Philosophy.

Dean, College of Arts and Sciences


Dean, Graduate School


Supervisory Committee:
Chairman